Abstract


The genus Claviceps has been known for centuries as an economically important fungal genus for pharmacology and agricultural 
research. Only recently have researchers begun to unravel the evolutionary history of the genus, with origins in South America and 
classification of four distinct sections through ecological, morphological, and metabolic features (Claviceps sects. Citrinae, 
Paspalorum, Pusillae, and Claviceps). The first three sections are additionally characterized by narrow host range, whereas section 
Claviceps is considered evolutionarily more successful and adaptable as it has the largest host range and biogeographical distribution. 
However, the reasons for this success and adaptability remain unclear. Our study elucidates factors influencing adaptability by 
sequencing and annotating 50 Claviceps genomes, representing 21 species, for a comprehensive comparison of genome architec- 
ture and plasticity in relation to host range potential. Our results show the trajectory from specialized genomes (sects. Citrinae and 
Paspalorum) toward adaptive genomes (sects. Pusillae and Claviceps) through colocalization of transposable elements around 
predicted effectors and a putative loss of repeat-induced point mutation resulting in unconstrained tandem gene duplication 
coinciding with increased host range potential and speciation. Alterations of genomic architecture and plasticity can substantially 
influence and shape the evolutionary trajectory of fungal pathogens and their adaptability. Furthermore, our study provides a large 
increase in available genomic resources to propel future studies of Claviceps in pharmacology and agricultural research, as well as
research into a deeper understanding of the evolution of adaptable plant pathogens.
Significance

Lack of genomic data for the Claviceps genus has hampered the ability to identify factors influencing the adaptation of
Claviceps species and mechanisms associated with the broad host range of some species. Our analysis reveals the
trajectory from specialized genomes toward adaptive genomes through a variety of genomic mechanisms which
coincided with increases in host range potential. These results demonstrate a clear example of how genomic alterations
can influence and shape the evolutionary trajectory of fungal pathogens in association with host range.

Introduction

Fungi, particularly phytopathogenic species, are increasingly
being used to gain insight into the evolution of eukaryotic
organisms, due to their adaptive nature and unique genome
structures (Gladieux et al. 2014; Dong et al. 2015).
Adaptation and diversification of fungal species can be mediated
by changes in genome architecture and plasticity, such as
genome size, transposable element (TE) content, localization
of TEs to specific genes, genome compartmentalization, gene
duplication rates, recombination rates, and presence/absence
polymorphism of virulence factors (Dong et al. 2015; Möller
and Stukenbrock 2017). The presence or absence of repeat-induced
point (RIP) mutation is also an important mechanism
for fungal genome evolution, as RIP works on a genomewide
scale to silence TEs and duplicated genes, which can also
“leak” onto neighboring genes (Galagan et al. 2003;
Galagan and Selker 2004; Raffaele and Kamoun 2012;
Urquhart et al. 2018; Möller and Stukenbrock 2017). It is
becoming increasingly evident that variations in these factors 
can be used to classify genomes as one speed (one compartment),
such as the powdery mildew fungi Blumeria graminis
f. sp. hordei and f. sp. tritici, two speed (two
compartments), such as the late blight pathogen
Phytophthora infestans, or multispeed (multicompartment),
such as the multihost pathogen Fusarium oxysporum (Dong
et al. 2015; Frantzeskakis et al. 2019). These different
“speeds” are characterized by their potential adaptability 
such that one-speed genomes are often considered less 
adaptable, whereas two-speed and multispeed genomes 
are often considered more adaptable (Dong et al. 2015; 
Frantzeskakis et al. 2019; Möller and Stukenbrock 2019).
The ergot fungi of the genus Claviceps (Ascomycota, 
Hypocreales) are biotrophic species that share a specialized 
ovarian-specific nonsystemic parasitic lifestyle with their grass 
hosts (Pichova et al. 2018). Infections are fully restricted to 
individual unpollinated ovaries (Tudzynski and Scheffer 2004), 
and the fungus actively manages to maintain host cell viability 
to obtain nutrients from living tissue through a complex cross- 
talk of genes related to pathogenesis, such as secreted effec- 
tors, secondary metabolites, or cytokinin production (Hinsch 
et al. 2015, 2016; Oeser et al. 2017; Kind, Schurack, et al. 
2018; Kind, Hinsch, et al. 2018). Species of Claviceps are most 
notably known for their production of toxic alkaloids and sec- 
ondary metabolites but are also known for their expansive 
host range and negative impact on global cereal crop produc- 
tion and livestock farming. These negative effects on human 
and livestock health are the primary reason Claviceps species 
are referred to as plant pathogens. However, under the light 
of coevolution with their grass hosts, some Claviceps species 
are considered conditional defensive mutualists with their 
hosts as they prevent herbivory and can improve host fitness 
(Raybould et al. 1998; Fisher et al. 2007; Wali et al. 2013). 
The genus Claviceps contains 59 species divided into four 
sections as follows: Claviceps, Pusillae, Citrinae, and 
Paspalorum (Pichova et al. 2018). It was postulated that 


sections Citrinae and Paspalorum originated in South 
America, whereas section Pusillae experienced speciation 
throughout the Eocene, Oligocene, and Miocene as these 
species encountered newly emergent PACMAD warm- 
season grasses (subfamilies Panicoideae, Aristidoideae, 
Chloridoideae, Micrairoideae, Arundinoideae, and 
Danthonioideae) when an ancestral strain was transferred 
from South America to Africa (Pichova et al. 2018). In contrast,
the crown node of section Claviceps is estimated at 20.4
Ma and was followed by a radiation of the section corresponding
to a host jump from ancestral sedges
(Cyperaceae) to the Bamboo, Oryzoideae, Pooideae (BOP)
clade (cool-season grasses; subfamilies Bambusoideae,
Oryzoideae [syn: Ehrhartoideae; Soreng et al. 2017], and
Pooideae) in North America (Bouchenak-Khelladi et al.
2010; Pichova et al. 2018). Section Claviceps has the largest
host range with C. purpurea sensu stricto (s.s.) having been 
reported on up to 400 different species in clade BOP 
(Alderman et al. 2004, Pichova et al. 2018) across six tribes 
and retains the ability to infect sedges (Cyperaceae) 
(Jungehülsing and Tudzynski 1997). In contrast, section
Pusillae is specialized to the tribes Paniceae and 
Andropogoneae, and sections Citrinae and Paspalorum only 
infect members of tribe Paspaleae and tribe Cynodonteae, 
respectively (Pichova et al. 2018). The shared specialized in- 
fection life cycle of the Claviceps genus, the drastic differences 
in host range potential of different species, and geographic 
distribution represent a unique system to study the evolution 
and host adaptation of eukaryotic organisms. 

Despite their ecological and agricultural importance, little is
known about the evolution and genomic architecture of these 
important fungal species in comparison with other cereal 
pathogens such as species in the genera Puccinia (Cantu 
et al. 2013; Kiran et al. 2016, 2017), Zymoseptoria (Estep 
et al. 2015; Grandaubert et al. 2015, 2019; Poppe et al. 
2015; Testa, Oliver et al. 2015; Wu et al. 2017; 
Stukenbrock and Dutheil 2018), or Fusarium (Kvas et al. 
2009; Ma et al. 2010; Rep and Kistler 2010; Watanabe 
et al. 2011; Sperschneider et al. 2015). Unfortunately, the 
lack of genome data for the Claviceps genus has hampered 
our ability to complete comparative analyses to identify fac- 
tors that are influencing the adaptation of Claviceps species 
across the four sections in the genus, and the mechanisms by 
which species of section Claviceps have adapted to such a 





2 Genome Biol. Evol. 13(2) doi:10.1093/gbe/evaa267 Advance Access publication 29 January 2021 




Whole-Genome Comparisons of Ergot Fungi 







broad host range, in comparison with the other three sec- 
tions. Here we present the sequences and annotations of 
50 Claviceps genomes, representing 19 species, for a compre- 
hensive comparison of the genus to understand evolution 
within the genus Claviceps by characterizing the genomic 
plasticity and architecture in relation to adaptive host poten- 
tial. Our analysis reveals the trajectory from specialized one- 
speed genomes (sects. Citrinae and Paspalorum) toward 
adaptive two-speed genomes (sects. Pusillae and Claviceps) 
through colocalization of TEs around predicted effectors 
and a putative loss of RIP resulting in tandem gene duplication 
coinciding with increased host range potential. 


Materials and Methods 

Sample Acquisition 

Field-collected samples (Clav) were surface sterilized, allowed
to grow as mycelia, and individual conidia transferred to make 
single spore cultures. Thirteen cultures were provided by Dr 
Miroslav Kolarik from the Culture Collection of 
Clavicipitaceae (CCC) at Institute of Microbiology, Academy 
of Sciences of the Czech Republic. Raw Illumina reads for 
samples (LM28, LM582, LM78, LM81, LM458, LM218, 
LM454, LM576, and LM583) were downloaded from NCBI 
SRA database. Raw Illumina reads from an additional 21 LM
samples were generated by Dr Liu’s lab (AAFC); the sequencing
protocol for these 21 samples followed Wingfield et al. (2018).
Summarized information can be found in supplementary
table S1, Supplementary Material online.








Preparation of Genomic DNA 


Cultures grown on cellophane PDA plates were used for genomic
DNA extraction from lyophilized mycelium following a
modified CTAB method (Doyle JJ and Doyle JL 1987;
Wingfield et al. 2018) without the RNase Cocktail
Enzyme Mix; only RNase A was used. DNA contamination
was checked by running samples on a 1% agarose gel and
a NanoDrop OneC (Thermo Fisher Scientific). Twenty samples
(7 Clav and 13 CCC) were sent to BGI-Hong Kong HGS Lab
for 150-bp paired-end Illumina sequencing on a HiSeq 4000.


Genome Assembly 


Preliminary data showed that raw reads of LM458 were contaminated
with bacterial DNA but showed strong species similarity
to Clav32 and Clav50. To filter out the bacterial DNA
sequences, reads of LM458 were mapped against the assem- 
bled Clav32 and Clav50 genomes using BBSplit v38.41 
(Bushnell 2014). All forward and reverse reads mapped to 
each of the genomes were concatenated, respectively. Both 
sets were then interleaved to remove duplicates and used for 
further analysis. Reads for all 50 samples were checked for 
quality with FastQC v0.11.5 (Andrews 2010) and trimmed 


with Trimmomatic v0.36 (Bolger et al. 2014) using the settings
SLIDINGWINDOW:4:20, MINLEN:36, and HEADCROP:10
to remove poor quality data; only paired-end reads were
used. To better standardize the comparative analysis, all 50
samples were subject to de novo genome assembly with
Shovill v0.9.0 (https://github.com/tseemann/shovill; last
accessed May 11, 2020) using SPAdes v3.11.1 (Nurk et al.
2013) with a minimum contig length of 1,000 bp.
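
The three trimming settings can be illustrated with a minimal Python sketch. It is not Trimmomatic's implementation: the function name trim_read is hypothetical, the order in which the steps are applied is an assumption, and the sliding-window rule is simplified to cutting at the start of the first window whose mean quality falls below the threshold.

```python
def trim_read(seq, quals, headcrop=10, window=4, min_avg_q=20, minlen=36):
    """Return the trimmed (seq, quals) pair, or None if the read is discarded.

    seq is the base string and quals a list of Phred scores of equal length.
    The step order used here (crop, window, length) is an assumption.
    """
    # HEADCROP: drop a fixed number of bases from the 5' end.
    seq, quals = seq[headcrop:], quals[headcrop:]

    # SLIDINGWINDOW: cut at the first window whose mean quality is too low.
    cut = len(seq)
    for start in range(0, len(seq) - window + 1):
        if sum(quals[start:start + window]) / window < min_avg_q:
            cut = start
            break
    seq, quals = seq[:cut], quals[:cut]

    # MINLEN: discard reads that ended up shorter than the minimum length.
    if len(seq) < minlen:
        return None
    return seq, quals
```

A read that passes keeps only its high-quality 5' portion after the head crop; a read that becomes shorter than 36 bases is dropped, which for paired-end data would also remove its mate from the paired set.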

The reference genomes of C. purpurea strain 20.1
(SAMEA2272775), C. fusiformis PRL 1980
(SAMN02981339), and C. paspali RRC 1481
(SAMN02981342) were downloaded from NCBI. Proteins
for C. fusiformis and C. paspali were not available on NCBI,
so they were extracted from GFF3 files provided by Dr Chris
Schardl and Dr Neil Moore, University of Kentucky, corresponding
to the 2013 annotations (Schardl et al. 2013) available
at http://www.endophyte.uky.edu (last accessed March
22, 2020). Reference genomes were standardized for comparative
analysis with our 50 annotated genomes by implementing
a protein length cutoff of 50 aa and removing
alternatively spliced proteins in C. fusiformis and C. paspali;
only the longest spliced protein for each locus remained.
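
The standardization step (length cutoff plus longest isoform per locus) can be sketched as follows. The function name standardize_proteins and the (locus, isoform, sequence) input layout are hypothetical and chosen only for illustration; the cutoff is assumed to keep proteins of at least 50 aa.

```python
def standardize_proteins(proteins, min_len=50):
    """Keep, per locus, the longest protein that meets the length cutoff.

    proteins: iterable of (locus_id, isoform_id, aa_sequence) tuples.
    Returns {locus_id: (isoform_id, aa_sequence)}.
    """
    longest = {}
    for locus, isoform, seq in proteins:
        if len(seq) < min_len:
            continue  # below the assumed 50-aa cutoff
        best = longest.get(locus)
        if best is None or len(seq) > len(best[1]):
            longest[locus] = (isoform, seq)
    return longest
```

Ties between isoforms of equal length are resolved here in favor of the first one encountered, which is an arbitrary choice of this sketch.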


Transposable Elements 


TE fragments were identified following procedures for the establishment
of de novo comprehensive repeat libraries set forth
in Coghlan et al. (2018); a brief summary is given below.
The following steps were automated through construction of
a custom script, TransposableELMT (https://github.com/PlantDr430/TransposableELMT).
Each of the 53 Claviceps genomes
was used to create a respective repeat library using
RepeatModeler v1.0.8 (Smit and Hubley 2015),
TransposonPSI (Haas 2010), and the long terminal repeat (LTR) predictor
LTR_finder v1.07 (Xu and Wang 2007) on default settings.
LTR_harvest v1.5.10 (Ellinghaus et al. 2008) was additionally
run on default settings, and results were filtered with
LTR_digest v1.5.10 (Steinbiss et al. 2009) with an HMM
search for Pfam domains associated with TEs; only candidates
with domain hits were kept. Repeat libraries from these four
programs were concatenated with all curated TEs from
RepBase (Bao et al. 2015) and redundant sequences were
removed using Usearch v11.0.667 (Edgar 2010) with a percent
identity cutoff of >80%. TEs for each of the nonredundant
libraries were classified using RepeatClassifier v1.0.8
(Smit and Hubley 2015). RepeatMasker v4.0.7 (Smit et al.
2015) was then used, on default settings with each assembled
genome and its respective repeat library, to soft mask the
genomes and identify TE regions. TE content was represented
as the proportion of the genome masked by TE regions determined
by RepeatMasker, excluding simple and low complexity
repeats.
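
The TE content measure described above amounts to a masked-base fraction. A minimal sketch of that arithmetic is given below; the function name te_fraction and the tuple layout are hypothetical, and overlapping annotations are merged so that no base is counted twice, which is an assumption about how overlaps should be handled rather than a description of RepeatMasker's own summary.

```python
def te_fraction(hits, genome_size, excluded=("Simple_repeat", "Low_complexity")):
    """Fraction of the genome covered by TE annotations.

    hits: iterable of (contig, start, end, repeat_class), 1-based inclusive.
    genome_size: total assembly length in bp.
    """
    by_contig = {}
    for contig, start, end, rclass in hits:
        if rclass in excluded:
            continue  # simple and low-complexity repeats are not counted
        by_contig.setdefault(contig, []).append((start, end))

    masked = 0
    for intervals in by_contig.values():
        intervals.sort()
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end + 1:      # overlapping or adjacent: extend
                cur_end = max(cur_end, end)
            else:                         # gap: close the current block
                masked += cur_end - cur_start + 1
                cur_start, cur_end = start, end
        masked += cur_end - cur_start + 1
    return masked / genome_size
```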

The TE divergences, calculated from RepeatMasker for TEs
in all 53 Claviceps genomes, were used to plot the divergence
landscape using a custom script (https://github.com/PlantDr430/CSU_scripts/blob/master/TE_divergence_landscape.py).
The RepeatMasker results were also used with
the respective GFF3 file from each genome to calculate the
average distance (kb) of each gene to the closest TE fragment
on the 5’ and 3’ flanking sides. Values were calculated
for predicted effectors, noneffector secreted genes, nonsecreted
metabolite genes, and all other genes using a
custom script (https://github.com/PlantDr430/CSU_scripts/blob/master/TE_closeness.py).
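
The gene-to-TE distance calculation can be sketched as below. This is not the TE_closeness.py script: the function name flanking_te_distances is hypothetical, only TE fragments lying entirely outside the gene are considered, and the 5’ side is assumed to be the lower-coordinate side for plus-strand genes and the higher-coordinate side for minus-strand genes.

```python
def flanking_te_distances(gene, tes):
    """Distance (bp) from a gene to the closest TE on its 5' and 3' sides.

    gene: (start, end, strand) with strand "+" or "-".
    tes: list of (start, end) TE fragments on the same contig.
    Returns (five_prime, three_prime); None where no TE lies on that side.
    """
    g_start, g_end, strand = gene
    # TE fragments fully to the left of / fully to the right of the gene.
    left = [g_start - te_end for te_start, te_end in tes if te_end < g_start]
    right = [te_start - g_end for te_start, te_end in tes if te_start > g_end]
    five_prime = min(left) if left else None
    three_prime = min(right) if right else None
    if strand == "-":
        # On the minus strand the 5' end is at the higher coordinate.
        five_prime, three_prime = three_prime, five_prime
    return five_prime, three_prime
```

Averaging these values over a gene category (dividing by 1,000 to express them in kb) would give per-category figures of the kind compared in the text.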


Genome Annotation 


AUGUSTUS v3.2.2 (Mario et al. 2008) was used to create
pretrained parameter files using the reference C. purpurea
strain 20.1, available expressed sequence tag (EST) data from
NCBI, and wild-type RNA-seq data (SRR4428945) created in
Oeser et al. (2017). RNA-seq data were subject to quality check
and trimming as above. All three data sets were also used to
train parameter files for the ab initio gene model prediction
software GeneID v1.4.4 (Blanco et al. 2007) and
CodingQuarry v2.0 (Testa et al. 2015). GeneID training followed
protocols available at http://genome.crg.es/software/geneid/training.html.
For CodingQuarry training, RNA transcripts
were created de novo using Trinity v2.8.4 (Grabherr
et al. 2011) on default settings and EST coordinates were
found by mapping the EST data to the reference genome
using Minimap2 v2.1 (Li 2018).

Gene models for the 50 genomes were then predicted
with GeneID and CodingQuarry using the trained
C. purpurea parameter files. CodingQuarry prediction was
also supplemented with transcript evidence by mapping the
available EST and RNA-seq C. purpurea data to each genome
using Minimap2. BUSCO v3 (Waterhouse et al. 2018) was run
on all 50 genomes using the AUGUSTUS C. purpurea pretrained
parameter files as the reference organism and the
Sordariomyceta database. The resulting predicted proteins
for each sample were used as training models for ab initio
gene prediction using SNAP (Korf 2004) and GlimmerHMM
v3.0.1 (Majoros et al. 2004). Last, GeMoMa v1.5.3
(Keilwagen et al. 2016) was used for ab initio gene prediction
using the soft-masked genomes and the C. purpurea 20.1
reference files.

Funannotate v1.6.0 (Palmer and Stajich 2019) was then
used as the primary software for genome annotation.
Funannotate additionally uses AUGUSTUS and GeneMark-ES
(Ter-Hovhannisyan et al. 2008) for ab initio gene model
prediction, Exonerate for transcript and protein evidence
alignment, and EVidenceModeler (Haas et al. 2008) for a final
weighted consensus. All C. purpurea EST and RNA-seq data
were used as transcript evidence, and the UniProt Swiss-Prot
database and proteins from several closely related species
(C. purpurea strain 20.1, C. fusiformis PRL1980, C. paspali
RRC1481, Fusarium oxysporum f. sp. lycopersici 4287,
Pochonia chlamydosporia 170, Ustilago maydis 521, and
Epichloe festucae F1) were used as protein evidence. The
AUGUSTUS pretrained C. purpurea files were used as
BUSCO seed species along with the Sordariomyceta database,
and all five ab initio predictions were passed through the
--other_gff flag with weights of 1. The following flags were also
used in Funannotate “predict”: --repeats2evm, --optimize_augustus,
--soft_mask 1000, --min_protlen 50. BUSCO was used
to evaluate annotation completeness using the Dikarya and
Sordariomyceta databases (odb9) with --prot on default
settings.


Functional Annotation 


Functional analysis was performed using Funannotate 
“annotate.” The following analyses were also performed on 
the three reference Claviceps genomes. Secondary metabolite 
clusters were predicted using antiSMASH v5 (Blin et al. 2019) 
with all features turned on. Functional domain annotations 
were conducted using eggNOG-mapper v5 (Huerta-Cepas 
et al. 2017, 2019) on default settings and InterProScan v5
(Jones et al. 2014) with the -goterms flag. Phobius v1.01
(Kall et al. 2007) was used to assist in prediction of secreted
proteins. In addition to these analyses, Funannotate also performed
domain annotations through an HMMer search
against the Pfam-A database and dbCAN CAZymes database,
a BlastP search against the MEROPS protease database,
and secreted protein predictions with SignalP v4.1 (Nielsen
2017).

For downstream analysis, proteins were classified as se- 
creted proteins if they had signal peptides detected by both 
Phobius and SignalP and did not possess a transmembrane 
domain as predicted by Phobius and an additional analysis of 
TMHMM v2.0 (Krogh et al. 2001). Effector proteins were 
identified by using EffectorP v2.0 (Sperschneider et al. 
2018), with default settings, on the set of secreted proteins 
for each genome. Transmembrane proteins were identified if 
both Phobius and TMHMM detected transmembrane 
domains. Secondary metabolite proteins were identified if 
they resided within metabolite clusters predicted by 
antiSMASH. Proteins were classified as having conserved pro- 
tein domains if they contained any Pfam or IPR domains. 
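
The classification rules above reduce to a few boolean combinations of per-protein predictions. The sketch below is illustrative only: the function and argument names are hypothetical, and the secreted rule is read as requiring that neither Phobius nor TMHMM predicts a transmembrane domain, which is one interpretation of the text.

```python
def classify_protein(signalp_sp, phobius_sp, phobius_tm, tmhmm_tm,
                     effectorp_hit=False):
    """Combine per-tool predictions (booleans) into downstream categories.

    signalp_sp / phobius_sp: signal peptide predicted by SignalP / Phobius.
    phobius_tm / tmhmm_tm: transmembrane domain predicted by Phobius / TMHMM.
    effectorp_hit: EffectorP call, only meaningful for secreted proteins.
    """
    # Secreted: signal peptide from both tools and no transmembrane domain
    # from either tool (assumed reading of the rule).
    secreted = signalp_sp and phobius_sp and not phobius_tm and not tmhmm_tm
    return {
        "secreted": secreted,
        # EffectorP was run only on the secreted set.
        "effector": secreted and effectorp_hit,
        # Transmembrane: both tools must agree on a transmembrane domain.
        "transmembrane": phobius_tm and tmhmm_tm,
    }
```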


Gene Family Identification and Classification 


OrthoFinder v2.3.3 (Emms and Kelly 2019) was run on default 
settings using Diamond v0.9.25.126 (Buchfink et al. 2015) to 
infer groups of orthologous gene clusters (orthogroups) based 
on protein homology and Markov Cluster Algorithm (MCL) 
clustering. To more accurately place closely related genes into
clusters, an additional 78 fungal genomes (supplementary
table S3, Supplementary Material online) with emphasis on
plant-associated fungi of the order Hypocreales were added.
To standardize, all 78 additional genomes were subject to a
protein length cutoff of 50 amino acids and genomes
downloaded from http://www.endophyte.uky.edu had alternatively
spliced proteins removed. For downstream analysis,
orthogroups pertaining to the 53 Claviceps genomes were
classified as secreted, predicted effectors, transmembrane,
metabolite, and conserved domain orthogroups if >50% of
the Claviceps strains present in a given cluster had at least one
protein classified as such.
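
The majority rule for labeling orthogroups can be written compactly. In this sketch the function name classify_orthogroup and the data layout are hypothetical, and the threshold is applied strictly (more than 50%), following the text as printed.

```python
def classify_orthogroup(members, flagged, threshold=0.5):
    """Decide whether an orthogroup carries a label (e.g., "secreted").

    members: {strain: [protein_id, ...]} for the Claviceps strains in the group.
    flagged: set of protein ids that carry the label.
    Returns True if more than `threshold` of the strains present have at
    least one flagged protein.
    """
    strains = [s for s, prots in members.items() if prots]
    if not strains:
        return False
    hits = sum(1 for s in strains if any(p in flagged for p in members[s]))
    return hits / len(strains) > threshold
```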


Phylogeny and Genome Fluidity 





Phylogenetic relationship of all 53 Claviceps genomes, with
Fusarium graminearum, F. verticillioides, Epichloe festucae,
and E. typhina as outgroups, was derived from 2,002
single-copy orthologs obtained from our OrthoFinder defined
gene clusters (described above). This resulted in a data set of
114,114 amino acid sequences that were concatenated to
create a supermatrix and aligned using MAFFT v7.429 (Katoh
and Standley 2013) on default settings. Uninformative sites
were removed using Gblocks v0.91 (Castresana 2000) on default
settings. Due to the large scale of the alignment, maximum
likelihood reconstruction was performed using FastTree
v2.1.11 (Price et al. 2010) using the Whelan and Goldman
matrix model of amino acid substitution with the -gamma,
-spr 4, -mlacc 2, -slownni, and -slow flags with 1,000 bootstraps.
MEGA X (Sudhir et al. 2018) was used for neighbor
joining (NJ) reconstruction using the Jones, Taylor, and
Thornton matrix model of amino acid substitution with gamma
distribution and maximum parsimony (MP) reconstruction using
the tree bisection and reconnection (TBR) algorithm with 100
repeated searches. Nodal support for both NJ and MP reconstructions
was assessed with 1,000 bootstraps. In addition,
an alignment and maximum likelihood (ML) reconstruction
was performed on each of the 2,002 protein sequences following
the procedure as above (MAFFT, Gblocks, FastTree). A
density consensus phylogeny was created from all gene trees
using the program DensiTree v2.2.5 (Bouckaert and Heled
2014). PhyBin v0.3-1 (Newton RR and Newton IL 2013) was
used to cluster trees from three data sets (1: Claviceps genus
without outgroups, 2: section Pusillae species, and 3: section
Claviceps species) together to identify frequencies of concordant
topologies using the --complete flag with --editdist=2.
To reduce noise from abundant incomplete lineage sorting in
section Claviceps, we implemented --minbranchlen=0.015
for our Claviceps genus data set.

Following methodologies established in Kislyuk et al.
(2011), genomic fluidity, which estimates the dissimilarity between
genomes by using ratios of the number of unique gene
clusters to the total number of gene clusters in pairs of
genomes averaged over randomly chosen genome pairs
from within a group of N genomes, was used to assess
gene cluster dissimilarity within the Claviceps genus. For a
more detailed description refer to Kislyuk et al. (2011). Data
sets containing gene clusters from representative members of
section Pusillae, section Claviceps, the Claviceps genus, and all
C. purpurea strains were extracted from our OrthoFinder defined
gene clusters. Additional species- and genus-wide gene
cluster data sets from the additional 78 fungal genomes were
extracted for comparative purposes. All section- and genus-wide
data sets contained one representative isolate from each
species to reduce phylogenetic bias. Each extracted data set
was used to calculate the genomic fluidity using a custom
script (https://github.com/PlantDr430/CSU_scripts/blob/master/pangenome_fluidity.py).
The result files for each data set
were then used for figure creation and two-sample two-sided
z test statistics (Kislyuk et al. 2011) using a custom script
(https://github.com/PlantDr430/CSU_scripts/blob/master/combine_fluidity.py).
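
The fluidity statistic described above can be illustrated with a short sketch. It is not the pangenome_fluidity.py script: the function name genomic_fluidity is hypothetical, every genome pair is enumerated rather than sampled at random, and the variance estimate used for the z tests is not included.

```python
from itertools import combinations


def genomic_fluidity(genomes):
    """Mean pairwise ratio of unique to total gene clusters.

    genomes: {name: set of gene-cluster ids present in that genome}.
    For each pair (k, l) the ratio is (U_k + U_l) / (M_k + M_l), where U is
    the number of clusters found in only one genome of the pair and M is the
    number of clusters in each genome.
    """
    pairs = list(combinations(genomes.values(), 2))
    if not pairs:
        raise ValueError("need at least two genomes")
    ratios = []
    for a, b in pairs:
        unique = len(a - b) + len(b - a)   # clusters in only one of the pair
        total = len(a) + len(b)            # M_k + M_l
        ratios.append(unique / total)
    return sum(ratios) / len(ratios)
```

With two genomes that each carry four clusters and share three of them, the ratio is 2/8, so the fluidity is 0.25: on average a quarter of the clusters in a pair are unique to one member.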


Gene Density Compartmentalization 


A custom script (https://github.com/PlantDr430/CSU_scripts/blob/master/genome_speed_hexbins.py)
was used to calculate
local gene density measured as 5’ and 3’ flanking distances
between neighboring genes (intergenic regions). To
statistically determine whether specific gene types had longer
intergenic flanking regions than all other genes within the
genome, we randomly sampled 100 genes from each group
(specific genes vs. other genes) 1,000 times for both the 5’
and 3’ flanking distances. The Mann-Whitney U test was used to
test for significance on all 2,000 subsets, corrected with
Benjamini-Hochberg. Corrected P values were averaged per
flanking side and then together to get a final P value. Genes
that appeared on a contig alone were excluded from analysis
(supplementary table S4, Supplementary Material online). For
graphical representation, genes that were located at the start
of each contig (5’ end) were plotted along the x axis, whereas
genes located at the end of each contig (3’ end) were plotted
along the y axis.
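
The correction and averaging of P values described above can be sketched in plain Python. The Mann-Whitney U tests themselves are taken as given inputs here. The function names are hypothetical, and pooling the 5’ and 3’ P values into a single Benjamini-Hochberg correction is an assumption; the text does not state whether the flanks were corrected jointly or separately.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def combined_p(p_five_prime, p_three_prime):
    """Average corrected P values per flank, then across the two flanks."""
    # Assumption: both flanks are pooled for a single correction.
    adj = benjamini_hochberg(list(p_five_prime) + list(p_three_prime))
    n5 = len(p_five_prime)
    mean5 = sum(adj[:n5]) / n5
    mean3 = sum(adj[n5:]) / len(p_three_prime)
    return (mean5 + mean3) / 2
```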





RIP and Blast Analyses 


For all 53 genomes a self-BlastP v2.9.0+ search was con- 
ducted to identify best hit orthologs within each genome 
with a cutoff e-value of 10~° and removal of self-hits. This 
process was automated using a custom script (https://github.com/PlantDr430/CSU_scripts/blob/master/RIP_blast_analysis.py).
We further examined if gene pairs with a pairwise identity of 
>80% were located next to each other and/or separated by 
five or fewer genes. Fifty-six important Claviceps genes (supple- 
mentary table S7, Supplementary Material online) including the 
rid-1 homolog (Freitag et al. 2002) were used in a BlastP analysis 
to identify the number of genes present that passed an e-value 
cutoff of 10~°, 50% coverage, and 35% identity. Genes that 
appeared as best hits for multiple query genes were only 
recorded once for their overall best match. In addition, the 
web-based tool The RIPper (Van Wyk et al. 2019) was used
on default settings (1-kb windows in 500-bp increments) to 
scan whole genomes for presence of RIP and large RIP affected 
regions (LRARs). 
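
The adjacency check on high-identity gene pairs described above can be sketched as follows. This is not the RIP_blast_analysis.py script: the function name tandem_pairs and the input layout are hypothetical, gene positions are taken as ordinal indices along each contig, and the identity threshold is applied strictly (greater than 80%), following the text as printed.

```python
def tandem_pairs(pairs, gene_order, min_identity=80.0, max_between=5):
    """Find high-identity gene pairs that sit close together on a contig.

    pairs: iterable of (gene_a, gene_b, percent_identity) from the self-BlastP.
    gene_order: {gene: (contig, index)} with index = ordinal position of the
        gene along its contig.
    Returns a list of (gene_a, gene_b, genes_between).
    """
    out = []
    for a, b, ident in pairs:
        if ident <= min_identity:
            continue                      # not above the identity threshold
        if a not in gene_order or b not in gene_order:
            continue
        contig_a, idx_a = gene_order[a]
        contig_b, idx_b = gene_order[b]
        if contig_a != contig_b:
            continue                      # on different contigs
        between = abs(idx_a - idx_b) - 1  # genes sitting between the pair
        if 0 <= between <= max_between:
            out.append((a, b, between))
    return out
```

A value of 0 for genes_between marks directly adjacent genes, and values up to 5 mark pairs separated by five or fewer genes.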
Statistical Programs and Plotting


Statistics and figures were generated using Python3 modules 
SciPy v1.3.1, statsmodels v0.11.0, and Matplotlib v3.1.1.
Heatmaps were generated using ComplexHeatmap v2.2.0 
in R (Gu 2016). 


Results 


Genome Assembly and Annotation 


To provide a comprehensive view of variability across 
Claviceps, we sequenced and annotated 50 genomes (19 
Claviceps spp.), including C. citrina, the single species of section
Citrinae, six species belonging to section Pusillae, and 44
genomes (12 species) belonging to section Claviceps, of which
23 genomes belong to C. purpurea s.s. (table 1 and supplementary
table S1, Supplementary Material online). The assemblies
and annotations were of comparable quality to the
reference strains (table 1). A more detailed representation
of the assembly and annotation statistics can be seen in table 1
and supplementary figure S1 and table S2, Supplementary
Material online.

Overall, species of section Claviceps had better assemblies
and annotations than species of other sections regarding contig
numbers, N50s, and BUSCO completeness scores (table 1).
Nearly all species of section Claviceps showed higher BUSCO
scores than the references, whereas species of sections
Pusillae and Citrinae generally showed lower scores, likely
due to their higher TE content (average 34.9 ± 11.0%,
table 1). Exceptions to the low BUSCO scores were C. digitariae
and C. maximensis (sect. Pusillae), which had lower TE content,
20.0% and 19.8%, respectively, than the rest of the
species in section Pusillae (table 1). However, C. africana
(sect. Pusillae, TE content = 34.0%) also had BUSCO scores
comparable to the references, with a higher N50 and lower
contig number than the rest of the species in section Pusillae
(table 1). Despite the differences in assembly quality between
species of section Pusillae, the genomic findings reported in
this study were comparable between members of
this section, indicating that both higher quality and lower quality
genomes of section Pusillae provided similar results.


Phylogenomics and Genome Fluidity 


Orthologous gene clusters (orthogroups), which contain 
orthologs and paralogs, were inferred from protein homology 
and MCL clustering using OrthoFinder. Across the 53
Claviceps isolates and outgroup species Fusarium graminearum,
F. verticillioides, Epichloe festucae, and E. typhina, we
identified 2,002 single-copy orthologs. We utilized a super- 
matrix approach to infer an ML species tree, based on these 
protein sequences. Results showed statistical support for four 
sections of Claviceps with a near concordant topology to the 
Bayesian five-gene phylogeny in Pichova et al. (2018). In 


addition, our topology of section Claviceps is concordant 
with a larger multilocus phylogeny of the section (Liu et al. 
2020). Our ML topology was also supported by NJ and max- 
imum parsimony supermatrix analyses (supplementary fig. S2 
and S3, Supplementary Material online). Notable exceptions 
were the placement of C. paspali (sect. Paspalorum) which
grouped closer to C. citrina (sect. Citrinae) instead of section 
Claviceps, and C. pusilla which grouped closer to C. fusiformis 
instead of C. maximensis (fig. 1). We also found that section 
Claviceps diverged from a common ancestor with section 
Pusillae as opposed to section Paspalorum. Our results provide 
support for the deeply divergent lineages of sections Pusillae, 
Paspalorum, and Citrinae with a long divergent branch result- 
ing in section Claviceps (fig. 1). 

Each of the 2,002 single-copy orthologs was also independently
aligned and analyzed in the same manner as our
supermatrix phylogeny from representative isolates of each 
species. A density consensus tree of all 2,002 topologies 
was concordant with our supermatrix analysis but reveals ev- 
idence of incongruencies, particularly within section Claviceps 
(supplementary fig. S4, Supplementary Material online), 
which could be caused by biological, analytical, and sampling 
factors (Steenwyk et al. 2019). Although grouping of species 
generally held true to figure 1, variation was more related to 
the order of branches, with C. cyperi, C. arundinis,
C. humidiphila, and C. perihumidiphila showing the most var- 
iability. These results indicate the presence of some incon- 
gruencies within section Claviceps, section Pusillae, and 
across the genus (supplementary figs. S5-S7, Supplementary
Material online) but a consensus supporting our ML species 
tree (fig. 1 and supplementary fig. S4, Supplementary 
Material online). There are several potential causes of these 
incongruencies that are currently the focal point of an ongo- 
ing study. 

To further elucidate trends of divergence within the genus, 
we examined genomic fluidity (Kislyuk et al. 2011) using all 
82,267 orthogroups from our previous OrthoFinder analysis. 
Genomic fluidity estimates the dissimilarity between genomes 
by using ratios of the number of unique orthogroups to the 
total number of orthogroups in pairs of genomes averaged 
over randomly chosen genome pairs from within a group of
N genomes. For example, a fluidity value of 0.05 indicates that 
randomly chosen pairs of genomes in a group will on average 
have 5% unique orthogroups and share 95% of their 
orthogroups (Kislyuk et al. 2011). Section Claviceps, which is 
composed of 12 different species, showed a relatively small 
genomic fluidity (0.0619 ± 0.0019) with limited variation, indicating
pairwise orthogroup dissimilarity between randomly
sampled genomes was quite low. The amount of variation
between 12 different Claviceps species was similar to the variation
between 24 C. purpurea s.s. isolates; however, the fluidities
were significantly different (P < 0.0001; supplementary
table S5, Supplementary Material online). In comparison, the
fluidity of section Pusillae (0.126 ± 0.014; P < 0.0001;
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e Absence of RIP-ike 
mechanisms 

e Lower TE content 

e Evidence of tandem 
gene duplication 

e Increased host range 


e Evolution towards aspects of a 
two-speed genome. 

¢ Codocalization of TEs around 
predicted effector genes 

e Evidence of predicted effectors 
located in more gene-spares 
regions of the genome 


Claviceps purpurea Clav26 
Claviceps purpurea LM30 
Claviceps purpurea LM14 
Claviceps purpurea LM60 
Claviceps purpurea LM46 
Claviceps purpurea LM39 
Claviceps purpurea LM28 
Claviceps purpurea LM207 
Claviceps purpurea LM461 
Claviceps purpurea LM5 
Claviceps purpurea Clav46 
Claviceps purpurea LM223 
Claviceps purpurea LM33 
Claviceps purpurea LM233 
Claviceps purpurea LM232 
Claviceps purpurea LM474 
Claviceps purpurea LM469 
Claviceps purpurea LM4 
Claviceps purpurea LM71 
Claviceps purpurea 20.1 
Claviceps purpurea LM582 
Claviceps purpurea Clav55 
Claviceps purpurea LM470 
Claviceps purpurea Clav04 
Claviceps aff. purpurea Clav52 
Claviceps capensis CCC 1504 
Claviceps pazoutovae CCC1485 
Claviceps monticola CCC1483 
Claviceps quebecensis Clav50 
Claviceps quebecensis Clav32 
Claviceps quebecensis LM458 
Claviceps occidentalis LM77 
Claviceps occidentalis LM84 
Claviceps occidentalis LM78 
Claviceps ripicola LM220 
Claviceps ripicola LM218 
Claviceps ripicola LM219 
Claviceps ripicola LM454 
Claviceps spartinae CCC535 
Claviceps arundinis LM583 
Claviceps arundinis CCC1102 
Claviceps humidiphila LM576 
Claviceps perihumidiphila LM8 
Claviceps cyperiCCC1219 
Claviceps pusilla CCC602 
Claviceps fusiformis PRL 1980 
Claviceps lovelessii CCC647 
Claviceps digitariae CCC659 
Claviceps maximensis CCC398 
Claviceps sorghi CCC632 
Claviceps africana CCC489 
Claviceps citrina CCC265 
Claviceps paspali RRC 1481 





Epichloe typhina 


Epichloe festucae Tree scale x 0 05 





Fusarium verticillioides 


Fusarium graminearum 


Fic. 1.—ML phylogenetic reconstruction of the Claviceps genus using amino acid sequences of 2,002 single copy orthologs with 1000 bootstrap 
replicates. Pink dots at branches represent bootstrap values >95. Arrows and descriptions indicate potential changes in genomic architecture between 


Claviceps sections identified in this study. 


supplementary table S5, Supplementary Material online) was 
two times greater than the fluidity of section Claviceps and 
exhibited greater variation, indicating greater dissimilarities in 
orthogroups between randomly sampled species of section 
Pusillae. 
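
The fluidity statistic described above can be illustrated with a minimal
sketch. It assumes each genome is reduced to the set of orthogroup IDs it
contains (a hypothetical input layout, not an OrthoFinder file format),
averages over all genome pairs, and leaves out the variance estimation
used for the standard errors.

```python
from itertools import combinations


def genomic_fluidity(orthogroups_by_genome):
    """Average, over all genome pairs, of unique / total orthogroups.

    orthogroups_by_genome maps a genome name to the set of orthogroup IDs
    found in that genome (illustrative layout, assumed for this sketch).
    """
    ratios = []
    for set_a, set_b in combinations(orthogroups_by_genome.values(), 2):
        unique = len(set_a ^ set_b)      # orthogroups present in only one genome of the pair
        total = len(set_a) + len(set_b)  # orthogroups in genome A plus orthogroups in genome B
        ratios.append(unique / total)
    return sum(ratios) / len(ratios)


# Toy data: three genomes that share most of their orthogroups.
toy_genomes = {
    "isolate_A": {"OG1", "OG2", "OG3", "OG4"},
    "isolate_B": {"OG1", "OG2", "OG3", "OG5"},
    "isolate_C": {"OG1", "OG2", "OG3", "OG4"},
}
fluidity = genomic_fluidity(toy_genomes)
print(round(fluidity, 4))
```

In the toy data, two of the three pairs differ by two orthogroups out of
eight and one pair is identical, so the value is the mean of 0.25, 0, and
0.25.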


Overall, our ML phylogeny (fig. 1) and genome fluidity
analysis (fig. 2) indicate a large evolutionary divergence
separating section Claviceps. Our subsequent analyses of the
genomic architecture of all Claviceps species examine factors
that could be associated with the evolutionary divergence of
section Claviceps and those driving cryptic speciation.

[Figure 2 plot not reproduced. x axis: number of genomes sampled (8 to
24). Legend, in descending order of fluidity, with the number of genomes
per group in parentheses: Fusarium-Genus (16), Epichloe-Genus (15),
Trichoderma-Genus (11), T. harzianum-Species (6), Pusillae-Section (7),
Claviceps-Genus (21), F. oxysporum-Species (10), Claviceps-Section (12),
C. purpurea-Species (24), F. fujikuroi-Species (12).]

Fig. 2. Genomic fluidity (dashed lines) for specified groups within the
order Hypocreales. Species-level groups contain multiple isolates of a
given species, whereas section- and genus-level groups contain one strain
from each representative species to remove phylogenetic bias. Shaded
regions represent standard error and were determined from total variance,
containing both the variance due to the limited number of sampled genomes
and the variance due to subsampling within the sample of genomes. Letters
correspond to significant differences between fluidities determined
through a two-sided two-sample z test (P < 0.05; supplementary table S4,
Supplementary Material online). The legend is in descending order based
on fluidity, and names are additionally appended to mean lines for
clarity.
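
As a worked illustration of the two-sided two-sample z test named in the
caption of figure 2, the sketch below compares two fluidity estimates. It
assumes that the "±" values reported in the text for section Pusillae and
section Claviceps are standard errors, which the text does not state
explicitly; the function name is illustrative.

```python
import math


def two_sided_z_test(estimate_1, se_1, estimate_2, se_2):
    """Two-sided two-sample z test on two estimates with standard errors."""
    z = (estimate_1 - estimate_2) / math.sqrt(se_1 ** 2 + se_2 ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability of a standard normal
    return z, p


# Fluidity of section Pusillae versus section Claviceps, taking the
# reported +/- values as standard errors (an assumption of this sketch).
z, p = two_sided_z_test(0.126, 0.014, 0.0619, 0.0019)
print(f"z = {z:.2f}, P = {p:.1e}")
```

Under that assumption the resulting P value falls well below 0.0001, in
line with the significance reported for this comparison.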


TE Divergences and Locations 


Because the genome data were generated on different sequencing
platforms, we examined the relationship between sequence
quality and predicted TE content to test for potential biases.
Results identified two clusters of genomes with differing
sequence qualities, which were determined to be a result of the
sequencer used. Despite these differences, analysis
of each cluster showed no relationship between sequence
quality and TE content (supplementary fig. S8,
Supplementary Material online). In addition, section
Claviceps samples were sequenced with both sequencers,
and results were highly comparable between these samples
(reported below), indicating no sequence quality bias.

TE divergence landscapes revealed an overrepresentation
of LTR elements in sections Pusillae, Citrinae, and Paspalorum.
All three sections showed a similar large peak of LTRs with
divergences between 5% and 10% (fig. 3 and supplementary
fig. S9, Supplementary Material online), indicating a relatively
recent expansion of TEs. The landscapes of sections Pusillae,
Citrinae, and Paspalorum are in striking contrast to those of
species of section Claviceps, which showed more similar abundances
of LTR, DNA, LINE, SINE, and RC (helitron) elements. Species of
section Claviceps showed broader peaks of divergence between
5% and 30% but also showed an abundance of TEs
with ~0% divergence, suggesting very recent TE expansion
(fig. 3 and supplementary fig. S9, Supplementary Material
online). The TE landscape of C. cyperi showed a more striking
peak of divergence between 5% and 10% that more closely
resembled the TE divergences of sections Pusillae,
Paspalorum, and Citrinae. However, the TE peak in C. cyperi
largely contained DNA, LINE, and unclassified
TEs as opposed to LTRs (supplementary fig. S9,
Supplementary Material online).
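
A divergence landscape of the kind summarized above is a binned sum of TE
fragment lengths per TE class. The sketch below is a minimal
illustration, assuming each fragment is available as a
(class, percent divergence, length in bp) record; the record layout and
the 1% bin width are assumptions of this example, not details taken from
the repeat annotation pipeline.

```python
from collections import defaultdict


def divergence_landscape(fragments, bin_width=1.0):
    """Sum fragment lengths per divergence bin and TE class.

    fragments: iterable of (te_class, percent_divergence, length_bp).
    Returns {bin_start: {te_class: total_bp}} ordered by bin.
    """
    landscape = defaultdict(lambda: defaultdict(int))
    for te_class, divergence, length_bp in fragments:
        bin_start = int(divergence // bin_width) * bin_width
        landscape[bin_start][te_class] += length_bp
    return {b: dict(classes) for b, classes in sorted(landscape.items())}


# Toy fragments: an LTR-dominated peak near 7% and one very young DNA element.
toy_fragments = [
    ("LTR", 7.3, 1200),
    ("LTR", 7.9, 800),
    ("LINE", 7.1, 300),
    ("DNA", 0.2, 500),
]
landscape = divergence_landscape(toy_fragments)
print(landscape)
```

Each bin of the returned dictionary corresponds to one stacked bar, with
one segment per TE class.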

Fig. 3. TE fragment divergence landscapes for representative species of
each Claviceps section: C. purpurea 20.1 (sect. Claviceps), C. maximensis
CCC398 (sect. Pusillae), C. paspali RRC 1481 (sect. Paspalorum), and
C. citrina (sect. Citrinae). Stacked bar graphs show the nonnormalized
sequence length occupied in each genome (y axis) for each TE type based
on their percent divergence (x axis) from their corresponding consensus
sequence. Landscapes for all remaining isolates can be seen in
supplementary figure S8, Supplementary Material online.

To identify where genes were located in relation to TEs, we
calculated the average distance (kb) of each gene to the
closest TE fragment. This analysis was performed for predicted
effectors, secreted (noneffector) genes, secondary metabolite
(nonsecreted) genes, and all other genes. Secreted genes and
predicted effectors of species in sections Claviceps and Pusillae
were found to be significantly closer to TEs compared with
other genes within each respective section (fig. 4;
P < 0.0001), suggesting that these genes could be located
in more repeat-rich regions of the genome. It should be noted
that we did observe a significant difference (P < 0.001,
Welch's test) in TE content between section Pusillae
(32.5 ± 9.59%) and section Claviceps (8.79 ± 1.52%). In both
sections Claviceps and Pusillae, secondary metabolite genes were
located farther away from TEs (fig. 4; P < 0.0001), that is, in
repeat-poor regions of the genome. These trends hold true
for individual isolates, with the notable exception of C. pusilla
(sect. Pusillae), which showed no significant differences in the
proximity of TEs to specific gene types (P > 0.12; supplementary
fig. S10, Supplementary Material online). Variation existed in
whether particular isolates had significant differences
between all other genes compared with secreted genes and
secondary metabolite genes, but all species in sections
Claviceps and Pusillae (aside from C. pusilla) had predicted
effector genes located significantly closer to TEs (P < 0.003;
supplementary fig. S10, Supplementary Material online). No
significant differences in the proximity of TEs to specific gene
types were observed in sections Citrinae and Paspalorum
(fig. 4; P > 0.11), suggesting that TEs are more randomly
distributed throughout these genomes.
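
The gene-to-TE distance described above can be sketched as follows. The
example assumes genes and TE fragments on one contig are given as
(start, end) coordinates in bp, averages the distances to the nearest TE
on either side of the gene, and ignores TE fragments that overlap the
gene; these simplifications and the function name are assumptions of this
sketch rather than details of the original analysis.

```python
from bisect import bisect_left, bisect_right


def mean_flanking_te_distance_kb(gene, te_starts, te_ends):
    """Mean distance (kb) from a gene to the closest TE on each side.

    gene: (start, end) in bp; te_starts and te_ends: sorted coordinate
    lists for TE fragments on the same contig. Returns None if the
    contig has no TE on either side of the gene.
    """
    gene_start, gene_end = gene
    distances = []
    i = bisect_right(te_ends, gene_start)  # TEs ending at or before the gene start
    if i > 0:
        distances.append(gene_start - te_ends[i - 1])
    j = bisect_left(te_starts, gene_end)   # TEs starting at or after the gene end
    if j < len(te_starts):
        distances.append(te_starts[j] - gene_end)
    if not distances:
        return None
    return sum(distances) / len(distances) / 1000


# Toy contig with three TE fragments and one gene between the first two.
tes = [(1000, 4000), (15000, 16000), (30000, 31000)]
te_starts = sorted(start for start, _ in tes)
te_ends = sorted(end for _, end in tes)
distance_kb = mean_flanking_te_distance_kb((10000, 12000), te_starts, te_ends)
print(distance_kb)
```

Because the two flanks are averaged, the result does not depend on which
side is the 5' or 3' end of the gene.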


Gene Density Compartmentalization 


To further examine genome architecture, we analyzed local
gene density, measured as flanking distances between
neighboring genes (intergenic regions), to look for evidence of
gene density compartmentalization (i.e., clustering of genes
with differences in intergenic lengths) within each genome.
Results showed that all 53 Claviceps strains exhibited a
one-compartment genome (lack of multiple compartments of
genes with different intergenic lengths), although there
was a tendency for more genes with larger intergenic regions
in sections Claviceps and Pusillae compared with sections
Citrinae and Paspalorum (fig. 5; supplementary fig. S11,
Supplementary Material online).
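
The flanking intergenic distances used as a measure of local gene density
can be computed per contig as in the sketch below. It assumes genes are
given as (gene ID, start, end) tuples, reports the flanks in coordinate
order rather than by strand, and sets overlapping neighbors to a distance
of zero; all of these are assumptions made for illustration.

```python
def flanking_intergenic_lengths(genes):
    """Return {gene_id: (left_bp, right_bp)} for genes on one contig.

    genes: list of (gene_id, start, end). A flank is None when the gene
    is the first or last gene on the contig.
    """
    ordered = sorted(genes, key=lambda g: g[1])
    flanks = {}
    for idx, (gene_id, start, end) in enumerate(ordered):
        left = right = None
        if idx > 0:
            left = max(0, start - ordered[idx - 1][2])    # gap to the previous gene
        if idx < len(ordered) - 1:
            right = max(0, ordered[idx + 1][1] - end)     # gap to the next gene
        flanks[gene_id] = (left, right)
    return flanks


# Toy contig: g1 sits in a gene-sparse region, g2 and g3 are close together.
toy_genes = [("g2", 1500, 2000), ("g1", 100, 500), ("g3", 2100, 3000)]
flanks = flanking_intergenic_lengths(toy_genes)
print(flanks)
```

Plotting the two flank lengths of every gene against each other gives the
kind of density view in which a second compartment of genes with long
intergenic regions would show up as a separate cluster.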

To further clarify evolutionary tendencies, we evaluated
whether gene types showed a difference in their flanking
intergenic lengths compared with other genes within their
genomes. Results showed that predicted effector genes in
section Claviceps had significantly larger intergenic flanking
regions compared with other genes, indicating they may
reside in more gene-sparse regions of the genome (P < 0.04,
fig. 5, supplementary fig. S11, Supplementary Material
online). Only C. digitariae and C. lovelessii (P < 0.01,
P = 0.024, respectively; supplementary fig. S11,
Supplementary Material online) of section Pusillae had
predicted effector genes with significantly larger intergenic
regions than other genes, although C. fusiformis and
C. pusilla were near significant (fig. 5, P = 0.054, P = 0.056,
respectively; supplementary fig. S11, Supplementary Material
online). Flanking intergenic regions of secreted genes were also
larger, and often significantly larger, than those of other
genes in section Claviceps (fig. 5; supplementary fig. S11,
Supplementary Material online). In contrast, secondary
metabolite genes exhibited a widespread distribution of
intergenic lengths that were not significantly different
from those of other genes in all 53 Claviceps strains
(P > 0.55, fig. 5; supplementary fig. S11, Supplementary
Material online).

Fig. 4. Boxplot distributions of predicted effectors, secreted
(noneffector) genes, secondary metabolite (nonsecreted) genes, and other
genes (i.e., genes that are not effectors, secreted, or secondary [2°]
metabolite genes) in Claviceps sections showing the mean distance (kb) of
each gene to the closest TE fragment (5' and 3' flanking distances were
averaged together). Kruskal-Wallis test (P value: *<0.05, **<0.01,
***<0.001, n.s. = not significant). Pairwise comparison was performed
with the Mann-Whitney U test with Benjamini-Hochberg multitest
correction. Letters correspond to significant differences between gene
categories within sections (P < 0.05). Plots for all individual isolates
can be seen in supplementary figure S9, Supplementary Material online.


RIP Analysis


To test for effects of RIP-like signatures, we assessed the
bidirectional similarity of genes against the second closest BlastP
match within each isolate's own genome (Galagan et al.
2003; Urquhart et al. 2018), supported by a BlastP analysis
against the rid-1 RIP gene of Neurospora crassa, and
calculations of RIP indexes in 1-kb windows (500 bp increments)
using The RIPper (Van Wyk et al. 2019). Results showed that
sections Pusillae, Citrinae, and Paspalorum had homologs of
rid-1, fewer genes with close identity (>80%), on average
27.4 ± 11.4% of their genomes affected by RIP, a mean
RIP composite index of −0.03 ± 0.21, and 325 ± 138
LRARs covering 3,984 ± 2,144 kb of their genomes, indicating
past or current activity of RIP-like mechanisms (fig. 6;
supplementary tables S6-S8, Supplementary Material online).
This is further supported by an average GC content of
42.84 ± 3.03% (table 1) in sections Pusillae, Citrinae, and
Paspalorum, which is on average 8.81% lower than in section
Claviceps, which shows an absence of RIP (reported below). The
presence of RIP-like mechanisms in sections Pusillae, Citrinae,
and Paspalorum was unexpected, given the abundance of TEs
within genomes of these sections (table 1, fig. 3, and
supplementary fig. S9, Supplementary Material online), as RIP-like
mechanisms should be working to silence and inactivate these
TEs. Although we did not directly test the activity of TEs within
our genomes, due to a lack of RNAseq data, the peaks of low
TE nucleotide divergence (<10%) in sections Pusillae,
Citrinae, and Paspalorum (fig. 3, supplementary fig. S9,
Supplementary Material online) suggest recent activity of
TEs (Frantzeskakis et al. 2018).

In comparison, species in section Claviceps lack rid-1
homologs and showed larger amounts of gene similarity and a
general lack of evidence of RIP-like signatures, with only
0.13 ± 0.03% of their genomes putatively affected by RIP and a
mean RIP composite index of −0.59 ± 0.01, suggesting that
RIP-like mechanisms are inactive (fig. 6 and supplementary
tables S6-S8, Supplementary Material online). Gene pairs
sharing >80% identity to each other were often located
near each other. On average, 27.02 ± 5.91% of the pairs
were separated by five or fewer genes, and 15.95 ± 3.50%
of the pairs were located next to each other, indicating signs of
tandem gene duplication within the section (supplementary
table S6, Supplementary Material online). C. cyperi showed
the smallest proportions of highly similar tandem genes
(7.77% and 5.7%) compared with other species within
section Claviceps. Additional variations in the proportions of
highly similar tandem genes between other species of section
Claviceps were not evident, as these proportions appeared to
vary more between isolates than between species (supplementary
table S6, Supplementary Material online).

Fig. 6. Representative isolates of each Claviceps species showing the
fraction of Blast hits (y axis) within each isolate (z axis) at a given
percent identity (x axis) from the second closest BlastP match of
proteins within each isolate's own genome. Two C. purpurea s.s. isolates
are shown to compare a newly sequenced genome versus the reference.
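
The windowed RIP indexes mentioned above can be illustrated with the
commonly used dinucleotide ratios: the product index TpA/ApT, the
substrate index (CpA + TpG)/(ApC + GpT), and the composite index taken as
product minus substrate. The sketch below is a minimal stand-alone
illustration of those ratios over 1-kb windows with a 500-bp step; it is
not The RIPper's implementation, and the function names are illustrative.

```python
def rip_indices(window):
    """Return RIP product, substrate, and composite indexes for a sequence.

    Returns None when a denominator is zero. None of the dinucleotides
    used here can overlap itself, so str.count gives exact counts.
    """
    seq = window.upper()
    count = seq.count
    product_den = count("AT")
    substrate_den = count("AC") + count("GT")
    if product_den == 0 or substrate_den == 0:
        return None
    product = count("TA") / product_den
    substrate = (count("CA") + count("TG")) / substrate_den
    return {"product": product, "substrate": substrate,
            "composite": product - substrate}


def sliding_windows(sequence, size=1000, step=500):
    """Yield (start, subsequence) for full-length windows only."""
    for start in range(0, len(sequence) - size + 1, step):
        yield start, sequence[start:start + size]


# Toy inputs: a short TpA-rich string and a 2.5-kb artificial contig.
toy = rip_indices("TATATATACG")
window_starts = [start for start, _ in sliding_windows("ACGT" * 625)]
print(toy, window_starts)
```

Composite values above zero are conventionally read as a RIP signature,
which is consistent with the strongly negative mean composite index
reported for section Claviceps being interpreted as an absence of RIP.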


Gene Cluster Expansion 


The proteomes of the Claviceps genomes were used to infer
orthologous gene clusters (orthogroups) through protein
homology and MCL clustering using OrthoFinder. Our results
revealed evidence of orthogroup expansion within section
Claviceps, as species contained more genes per orthogroup
than species of the other three sections (supplementary fig.
S12, Supplementary Material online). To identify the types of
gene clusters that were showing putative expansion, we
filtered our clusters by the following two criteria: 1) at least
one isolate had two or more genes in the orthogroup and 2)
there was a significant difference in the mean number of
genes per orthogroup between all 44 isolates in section
Claviceps and the 9 isolates from sections Pusillae, Citrinae,
and Paspalorum (α < 0.01, Welch's test).
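
The two filtering criteria can be sketched with the standard library
only. The example assumes a per-orthogroup table of gene counts per
isolate (a hypothetical layout), computes Welch's t statistic with
Welch-Satterthwaite degrees of freedom, and approximates the two-sided P
value by numerically integrating the t density; the toy groups are far
smaller than the 44 and 9 isolates compared in the study.

```python
import math
from statistics import mean, variance


def t_two_sided_p(t, df, steps=2000):
    """Two-sided P value of Student's t via Simpson's rule on [0, |t|]."""
    t = abs(t)
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


def welch_p(a, b):
    """Welch's unequal-variance t test; returns the two-sided P value."""
    sa, sb = variance(a) / len(a), variance(b) / len(b)
    if sa + sb == 0:  # no variation in either group
        return 1.0 if mean(a) == mean(b) else 0.0
    t = (mean(a) - mean(b)) / math.sqrt(sa + sb)
    df = (sa + sb) ** 2 / (sa ** 2 / (len(a) - 1) + sb ** 2 / (len(b) - 1))
    return t_two_sided_p(t, df)


def putatively_expanded(counts, focal, other, alpha=0.01):
    """Apply both criteria to {orthogroup: {isolate: gene count}}."""
    hits = []
    for orthogroup, per_isolate in counts.items():
        a = [per_isolate.get(i, 0) for i in focal]
        b = [per_isolate.get(i, 0) for i in other]
        if max(a + b) < 2:  # criterion 1: some isolate has two or more genes
            continue
        if welch_p(a, b) < alpha:  # criterion 2: group means differ
            hits.append(orthogroup)
    return hits


# Toy table: only OG_A has multi-copy genes and a clear difference in means.
focal = ["c1", "c2", "c3", "c4", "c5"]
other = ["p1", "p2", "p3"]
toy_counts = {
    "OG_A": dict(zip(focal + other, [4, 5, 4, 6, 5, 1, 1, 2])),
    "OG_B": dict(zip(focal + other, [1, 1, 1, 1, 1, 1, 1, 1])),
    "OG_C": dict(zip(focal + other, [2, 1, 1, 2, 1, 1, 2, 1])),
}
expanded = putatively_expanded(toy_counts, focal, other)
print(expanded)
```

As in the criteria above, the test is two-sided, so an orthogroup passes
whenever the group means differ significantly; the direction of the
difference would be read from the means afterwards.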

Overall, we identified 863 (4.7%) orthogroups showing
putative expansion. We observed extensive expansion
(orthogroups with observations of greater than or equal to
ten genes per isolate) in many unclassified, predicted
effector, and secreted (noneffector) orthogroups, and in
orthogroups encoding genes with conserved domains (fig. 7
and supplementary figs. S13 and S14, Supplementary
Material online). Transmembrane orthogroups also showed
evidence of expansion, with several isolates having five to
ten genes. Orthogroups with secondary metabolite genes
showed the lowest amount of expansion (supplementary
fig. S15, Supplementary Material online). Overall, section
Claviceps showed expansion in a greater number of
orthogroups than sections Pusillae, Citrinae, and Paspalorum
in all categories except transmembrane orthogroups
(supplementary fig. S15, Supplementary Material online).
Orthogroups with an average of greater than or equal to five
genes per isolate within section Claviceps contained a variety
of functional proteins, with generally more proteins encoding
protein/serine/tyrosine kinase domains (supplementary table S9,
Supplementary Material online). Additional details can be
obtained from supplementary tables S10 (ordered orthogroups
corresponding to heatmaps; fig. 7 and supplementary figs. S13
and S14, Supplementary Material online), S11-1, and S11-2,
Supplementary Material online (orthogroup identification
and functional annotation of all proteins).

Within section Claviceps, patterns of gene counts per
orthogroup appeared to break down, with variation in the
number of genes per orthogroup and some presence/absence
differences occurring between isolates and species. Notably,
C. cyperi (CCC1219) showed the lowest amount of expansion,
across all taxa, in comparison with other species of section
Claviceps. In addition, C. spartinae (CCC535), C. capensis
(CCC1504), C. monticola (CCC1483), C. pazoutovae
(CCC1485), C. occidentalis (LM77, 78, 84), and
C. quebecensis (LM458, Clav32, 50) also showed lower
expansion (fig. 7, supplementary figs. S13 and S14,
Supplementary Material online). However, no patterns were
observed linking the variation in expansions with the
literature-determined host range of different species within
section Claviceps.

Fig. 7. Heatmap of gene counts in orthogroups for all 53 Claviceps
strains ordered based on the ML tree in figure 1 and separated by
sections. Orthogroups are separated based on their classification and are
only represented once (i.e., secondary [2°] metabolite orthogroups shown
are those that are not already classified into the effector or secreted
orthogroups) and are ordered based on hierarchical clustering; see
supplementary table S9, Supplementary Material online, for the list of
orthogroups corresponding to the order shown in the heatmaps. The host
spectrum is generalized across species, as no literature has determined
the existence of race-specific isolates within species, and was
determined from a literature review of field-collected samples
(Supplementary Material in Pichova et al. 2018) and previous inoculation
tests (Campbell 1957; Liu et al. 2020). For the heatmap of conserved
domains, see supplementary figure S12, Supplementary Material online, and
for unclassified gene families, see supplementary figure S13,
Supplementary Material online.


Discussion 


Our comparative study of 50 newly annotated genomes from
four sections of Claviceps has provided us with an enhanced
understanding of evolution in the genus through knowledge
of factors associated with its diversification. Our results have
revealed that despite having nearly identical life strategies,
these closely related species have substantially altered
genomic architecture and plasticity, which may drive genome
adaptation. One key difference we observe is a shift from
aspects that are characteristic of a one-speed genome (i.e.,
less adaptable) in narrow host-range Claviceps species (sects.
Citrinae and Paspalorum) toward aspects that are
characteristic of a two-speed genome (i.e., more adaptable) in
broader host-range lineages of sections Pusillae and Claviceps
(fig. 1; Dong et al. 2015; Frantzeskakis et al. 2019).

The oldest divergent species of the genus (Pichova et al.
2018), C. citrina (sect. Citrinae) and C. paspali (sect.
Paspalorum), are characterized by a proliferation of TEs,
particularly LTRs, which do not appear to be colocalized around
particular gene types (fig. 4). Coupled with a lack of
large-scale genome compartmentalization (fig. 5), these two
species can be considered to fit aspects of a one-speed
genome, which is often considered to be less adaptable
and potentially more prone to being purged from the biota
(Dong et al. 2015; Frantzeskakis et al. 2019). This could help
explain the paucity of lineages in these sections and their
restricted host range of one grass tribe, as similar patterns of
large genome size, abundant TE content, and equal distribution
of TEs have been observed in the specialized barley pathogen
Blumeria graminis f.sp. hordei (Frantzeskakis et al. 2018).
However, rapid adaptive evolution within B. graminis f.sp.
hordei has been suggested to occur through copy-number
variation and/or heterozygosity of effector loci (Dong et al.
2015; Frantzeskakis et al. 2018, 2019). Our results show a lack
of gene duplication in sections Citrinae and Paspalorum, likely
due to the presence of RIP-like mechanisms.
However, even with the presence of RIP-like mechanisms,
there was a high LTR content in these species (fig. 3). This
suggests either that these LTR elements have found a way to
avoid RIP-like mechanisms or that these species harbor a
less active version of a RIP-like mechanism, as is found in
several fungal species (Kachroo et al. 1994; Nakayashiki et al.
1999; Graia et al. 2001; Ikeda et al. 2002; Chalvet et al. 2003;
Kito et al. 2003). Nonetheless, due to the high abundance of
TEs (fig. 4) and presence of RIP (fig. 6 and supplementary
tables S6 and S7, Supplementary Material online), we
hypothesize that RIP-like "leakage" could be a likely mechanism
for evolution in C. citrina and C. paspali (and similarly
sect. Pusillae), as has been shown to occur in other fungi (Fudal
et al. 2009; Van de Wouw et al. 2010; Hane et al. 2015). It
should be noted that since the estimated divergence of
section Citrinae 60.5 Ma (Pichova et al. 2018), it has remained
monotypic. It was only recently that unknown lineages of
section Paspalorum were identified (Oberti et al. 2020),
although these lineages were found on the same host genus
as C. paspali (Paspalum spp.), supporting our hypothesis
that species within section Paspalorum have restricted host
ranges. These recent findings further suggest that the lack of
additional lineages within these sections could be due to limited
records of Claviceps species in South America, where the
genus is thought to have originated (Pichova et al. 2018).
Further research into South American populations of
Claviceps will provide significant insight into the evolution of
these two sections.

Members of section Pusillae also exhibited a proliferation of
TEs. However, as this section diverged from sections Citrinae
and Paspalorum, the genomic architecture evolved such that
TEs colocalized around predicted effector genes (fig. 4). This
proximity of TEs to effectors persisted in section Pusillae
species (except C. pusilla; supplementary fig. S10, Supplementary
Material online) and section Claviceps species, potentially
resulting in the large intergenic regions flanking predicted
effector genes (fig. 5, supplementary fig. S11,
Supplementary Material online). Together, these genomic
alterations indicate aspects of a two-speed genome (Dong
et al. 2015; Moller and Stukenbrock 2017). These observed
genomic changes may have influenced the divergence and
adaptability of sections Pusillae and Claviceps (fig. 1), similar to
what has been observed in other fungi (Raffaele and Kamoun
2012; Stukenbrock 2013; Moller and Stukenbrock 2017) and
has been proposed to promote genomic flexibility and drive
accelerated evolution of these genome compartments
(Raffaele et al. 2010; Rouxel et al. 2011; de Jonge et al. 2013;
Faino et al. 2015, 2016; Seidl et al. 2015). Despite the number of
studies that suggest this role of TEs in genome evolution,
there has been limited evidence for the mechanism by which
TEs drive evolution in filamentous pathogens. However,
studies incorporating improved genome assemblies of multiple
individuals of a species along with transcriptome data have
been able to demonstrate that transcriptionally active TEs
are present in lineage-specific regions of the plant pathogen
Verticillium dahliae (Amyotte et al. 2012; Faino et al.
2016), resulting in genomic diversity through large-scale
duplications in these lineage-specific regions (Faino et al. 2016).
This also led to the frequent loss of the effector Ave1, which
is located in a TE-rich lineage-specific region, in
populations of V. dahliae (de Jonge et al. 2012).

Although we did not have transcriptome data to determine
how many of the TEs are transcriptionally active, our data do
show that most of the repetitive elements in section Claviceps
species have very low nucleotide divergence (<1%)
compared with TEs in sections Pusillae, Paspalorum, and Citrinae
(5-20% nucleotide divergence; fig. 3), suggesting a recent
section-specific expansion of TEs that is associated with a
recent host range and geographic expansion and proliferation
of recently described cryptic species (Liu et al. 2020) within
section Claviceps. Similar observations placing TE bursts
around speciation times have been reported in the plant
pathogen Leptosphaeria maculans (Rouxel et al. 2011;
Grandaubert et al. 2014) and in the grass-infecting (Blumeria
spp.) and dicot-infecting (Erysiphe spp.) powdery mildews
(Frantzeskakis et al. 2018). Theoretical models have proposed
that repeated changes in phenotypic optimum in a dynamic
fitness landscape may induce explosive bursts of transposon
activity associated with faster adaptation (Startek et al. 2013).
However, long-term maintenance of transposon activity is
unlikely, and this may contribute to significant variation in the
TE copy number among closely related species. Our finding that
the variation in TE copy number between species in the genus
Claviceps fits this pattern calls for future studies to clarify
the relationship between TE expansion and changes in host
range, geographic distribution, and cryptic speciation.

Furthermore, our analyses revealed that a key difference
between section Claviceps and section Pusillae is a putative
loss of RIP-like mechanisms (figs. 1, 6 and supplementary
table S7, Supplementary Material online). In the absence of
RIP-like mechanisms, the gene-sparse regions rich in TEs and
effectors could be hot spots for duplication, deletion, and
recombination (Galagan et al. 2003; Galagan and Selker
2004; Raffaele and Kamoun 2012; Dong et al. 2015; Faino
et al. 2016; Moller and Stukenbrock 2017; Frantzeskakis et al.
2018, 2019). This would explain the observations of tandem
gene duplication within the section (figs. 6, 7 and
supplementary table S6, figs. S12-S15, Supplementary Material
online), which may facilitate rapid speciation, as has been
postulated in several smut fungi (Kamper et al. 2006; Schirawski
et al. 2010; Dutheil et al. 2016). In fact, C. cyperi, a species of
section Claviceps that is thought to be ancestral based on
ancestral state reconstructions of host range (Pichova et al.
2018), showed the least amount of gene cluster expansion and
tandem duplication (fig. 7 and supplementary table S6, figs. S13
and S14, Supplementary Material online), indicating that gene
duplication may be contributing to the divergence of new
species, as other species in section Claviceps have increased
genome size, gene count, and number of closely related gene
pairs (>80% identity) (table 1 and supplementary table S6,
Supplementary Material online). It is unclear whether these
changes in gene duplication rate are a selective or neutral
mutational process. Because the increased occurrence of gene
duplication within section Claviceps is likely a result of a loss
of RIP-like mechanisms, it is more plausible to suggest that the
change in propensity for gene duplication was a neutral process.
However, our evidence of effector duplications suggests
that this change in propensity may have allowed an increased
chance for future adaptive events. Within section Claviceps,
gene duplication is likely facilitated by recombination events
during annual sexual reproduction (Esser and Tudzynski
1978). Future studies on recombination will be critical to
our understanding of the mechanisms driving gene duplication
and to elucidating factors associated with the observations
of potential incomplete lineage sorting (Pease and Hahn
2013) within the section.

Substantially altered genomic architecture and plasticity
between Claviceps sections were observed in this study, yet it
is unclear whether the evolution of these genomes was
caused by contact with new hosts and different climates as
ancestral lineages migrated out of South America (Pichova
et al. 2018) or whether the evolution toward aspects of a
two-speed genome provided an advantage in adapting to new
hosts or environments. Further research is needed to clarify
this point. As sections Pusillae and Claviceps have larger host
ranges (5 tribes and 13 tribes, respectively) and increased
levels of speciation (Pichova et al. 2018), they represent ideal
systems to test this hypothesis. It is postulated that section
Pusillae was transferred to Africa (ca. 50.3 Ma), whereas
section Claviceps originated in North America (ca. 20.7 Ma), and
it is likely that the common ancestor shared between these
sections (fig. 1) had strains that were transferred to Africa,
likely by insect vectors via transatlantic long-distance
dispersal (Pichova et al. 2018). The strains that remained in
South America likely persisted but appeared not to speciate
for roughly 30 Myr (Pichova et al. 2018), despite having
aspects of a more adaptable two-speed genome (figs. 4, 5).
Limited sampling records could be a factor contributing to this
lack of speciation during this 30 Myr period, but it could also
be suggested that the ancestral species of section Claviceps
did not diverge due to a lack of diversification of host species
(Pichova et al. 2018). It is well known that Claviceps species
share a rather unique relationship with their hosts (strict
ovarian parasites). The evolution of the Claviceps genus appears
to be primarily driven by the evolution and diversification of the
host species (Pichova et al. 2018). This can be inferred from
divergence time estimates, which show that the crown node
of section Pusillae aligns with the crown node of PACMAD
grasses (ca. 45 Ma) (Bouchenak-Khelladi et al. 2010; Pichova
et al. 2018), suggesting that these two groups radiated in
tandem after ancestral strains of section Pusillae were
transferred to Africa. Similarly, the estimated crown node of
section Claviceps corresponds with the origin of the core
Pooideae (Poeae, Triticeae, Bromeae, and Littledaleae), which
occurred in North America (ca. 33-26 Ma) (Bouchenak-
Khelladi et al. 2010; Sandve and Fjellheim 2010).

Such a large difference between the estimate divergence 
age (~30 Myr) and long divergence branch (fig. 1) between 
section Clavcieps and the other three sections (Pichova et al. 
2018) could suggest that a sudden event sparked the adaptive 
radiation within this section (fig. 1). Under an assumption that 
ancestral strains of section Claviceps were infecting sedges 
(Cyperaceae), as is seen in the ancestral C. cyperi (Pichova 
et al. 2018), a host jump to BOP grasses could have ignited 
the rapid speciation of section Claviceps, similar to the sug- 
gested tandem radiation of section Pusillae with the PACMAD 
grasses in Africa. However, unknown factors might be re- 
sponsible for the drastic genomic changes (i.e., putative loss 
of RIP-like mechanisms) observed in section Claviceps, as no 
such changes were observed in section Pusillae. The radiation 
of the core Pooideae occurred after a global supercooling 
period (ca. 33-26 Ma) in North America. During this period, 
Pooideae experienced a stress response gene family expansion 
that enabled adaptation and diversification to cooler, more 
open, habitats (Kellogg 2001; Sandve and Fjellheim 2010). The 
gene cluster expansion observed in section Claviceps (the 
only section to infect BOP grasses) suggests that the same 
environmental factors that caused the radiation of Pooideae 
could have similarly affected section Claviceps (Kondrashov 
2012) and might have resulted in the host jump to 
Pooideae, and potentially other BOP tribes. Interestingly, 
one of the orthogroups significantly expanded in section 
Claviceps (OG0000016) contains proteins associated with a 
cold-adapted (Alias et al. 2014) serine peptidase S8 subtilase 
(MER0047718; S08.139) (supplementary table S9, 
Supplementary Material online). Although the crown node 
of section Claviceps is estimated at ~5-10 Myr before the 
radiation of the core Pooideae, the 95% highest posterior 
density determined in Pichova et al. (2018) could indicate 
both radiation events occurred at similar times. 

Further examination of Claviceps species in South and 
Central America needs to be conducted to better elucidate 
the evolution and dispersal of the genus (Pichova et al. 2018). 
Efforts should focus on the elusive C. junci, a pathogen of 
Juncaceae (rushes), which is thought to reside in section 
Claviceps based on morphological and geographic character- 
istics (Langdon 1952; Pichova et al. 2018). This species, and 
potentially others, will provide further insight into the early 
evolution of section Claviceps and could bridge the current 
gap between the environmental factors that sparked the radiation 
of the core Pooideae and section Claviceps. Last, it 
would be interesting to examine if other phytopathogenic 
fungal species that diverged in North America ~20 Ma expe- 
rienced similar genomic alterations and host range 
expansions. 

Supplementary data are available at Genome Biology and 
Evolution online. 